Scalable I/O and analytics
Abstract
High-performance computing systems have already approached peta-scale, with hundreds of thousands of processors/cores in many deployments. These systems promise a new level of predictive and knowledge-discovery ability as researchers gain the capability to model dependencies between phenomena at scales not seen earlier. The applications that run on them are highly I/O- and data-intensive, leading scientists to observe that performing I/O and the subsequent analyses is a major bottleneck in effectively utilizing peta-scale systems and a major hurdle in accelerating discoveries. Although significant progress has been made recently in performance, interfaces, and middleware runtime systems for I/O, considerably more research and development is needed to scale performance to the desired levels for systems containing tens to hundreds of thousands of cores. In this work we outline our recent achievements and current research on designing scalable I/O software and enabling data analytics in storage systems. We also enumerate key challenges for I/O systems and discuss ongoing efforts that address them.
Similar resources
VSFS: A Versatile Searchable File System for HPC Analytics
Emerging HPC analytics applications urgently demand file-search services to drastically reduce the scale of the input data in real time, so that the speed of computation and data analytics can be greatly accelerated. Unfortunately, the existing file-search solutions are either poorly scalable for large-scale systems, or lack a well-integrated interface to allow applications to easily use them fo...
An Interaction Based Composable Architecture for Building Scalable Models of Large Social, Biological, Information and Technical Systems
(i) High-resolution scalable models of complex socio-technical systems; (ii) Service-oriented architecture and delivery mechanism for facilitating the use of these models by domain experts; (iii) Distributed coordinating architecture for information fusion, model execution and data processing; and (iv) Scalable data management architecture and system to support model execution and analytics Scalabl...
Ten Research Questions for Scalable Multimedia Analytics
The scale and complexity of multimedia collections are ever increasing, as is the desire to harvest useful insight from the collections. To optimally support the complex quest for insight, multimedia analytics has emerged as a new research area that combines concepts and techniques from multimedia analysis and visual analytics into a single framework. State of the art multimedia analytics soluti...
Scalable Scientific Computing Algorithms Using MapReduce
Cloud computing systems, like MapReduce and Pregel, provide a scalable and fault tolerant environment for running computations at massive scale. However, these systems are designed primarily for data intensive computational tasks, while a large class of problems in scientific computing and business analytics are computationally intensive (i.e., they require a lot of CPU in addition to I/O). In ...
Goal-based composition of scalable hybrid analytics for heterogeneous architectures
Crafting scalable analytics in order to extract actionable business intelligence is a challenging endeavour, requiring multiple layers of expertise and experience. Often, this expertise is irreconcilably split between an organisation’s engineers and subject-matter domain experts. Previous approaches to this problem have relied on technically adept users with tool-specific training. Such an approa...
Publication date: 2009